Automatic Text Decomposition and Structuring
Abstract
Sophisticated text similarity measurements are used to determine relationships between natural-language texts and text segments. The resulting linked hypertext maps are used to identify different text types and text structures, leading to improved text access and utilization. Examples of text decomposition are given for expository and non-expository texts.

The vector processing model of retrieval has been used with substantial success to manipulate large collections of natural-language text. In vector processing, texts or text excerpts, as well as requests for information, are represented by sets of terms, or term vectors. Collectively, the terms assigned to a given text are used to represent text content. Substantially identical methods can be used for determining collection structure (by comparing pairs of text vectors with each other and identifying text pairs found to be sufficiently similar) and for retrieving information (by comparing query vectors with the vectors representing the stored items and retrieving items found to be similar to the queries). The results of a similarity computation between a query vector and the stored document vectors can be ranked in decreasing order of the computed query similarity. This makes
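The two uses of the vector model described above, ranking stored texts against a query and linking text pairs that are sufficiently similar, can be illustrated with a minimal Python sketch. Several choices here are assumptions made for illustration and are not taken from the abstract: raw term counts as vector weights, cosine as the similarity measure, a linking threshold of 0.3, and the function names `term_vector`, `cosine`, `rank`, and `link_similar`.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Represent a text as a sparse vector of raw term counts (assumed weighting)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity of two sparse term vectors; 0.0 if either is empty."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, texts):
    """Return (index, similarity) pairs in decreasing order of query similarity."""
    q = term_vector(query)
    scores = [(i, cosine(q, term_vector(t))) for i, t in enumerate(texts)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def link_similar(texts, threshold=0.3):
    """Return index pairs (i, j) whose similarity meets the assumed threshold."""
    vectors = [term_vector(t) for t in texts]
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(vectors), 2)
        if cosine(u, v) >= threshold
    ]


# Hypothetical three-text collection used only to exercise the functions.
texts = [
    "vector models represent text content with term vectors",
    "term vectors represent text content in retrieval",
    "the weather was sunny all week",
]
ranking = rank("term vectors for text retrieval", texts)
links = link_similar(texts)
```

The same `cosine` function serves both tasks, which mirrors the point that substantially identical methods apply to retrieval and to determining collection structure. The pairs returned by `link_similar` are the edges of a simple link map over the collection.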
Similar resources
Automatic Structuring of Written Texts
This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and heuristic rules that are used for automatic or semiautomatic labelling. Inside the detected sentence the algorithm performs a decomposition to clauses and then marks the parts of text which do not form a sentence, i.e. headings, signa...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
A symbolic approach to automatic multiword term structuring
This paper presents a three-level structuring of multiword terms (MWTs) based on lexical inclusion, WordNet similarity and a clustering approach. Term clustering by automatic data analysis methods offers an interesting way of organizing a domain’s knowledge structures, useful for several information-oriented tasks like science and technology watch, text mining, computer-assisted ontology popula...
Semi-Automatic Terminology Ontology Learning Based on Topic Modeling
Ontologies provide features like a common vocabulary, reusability, and machine-readable content, and also allow for semantic search, facilitate agent interaction, and support ordering and structuring of knowledge for Semantic Web (Web 3.0) applications. However, the challenge in ontology engineering is automatic learning, i.e., there is still a lack of a fully automatic approach from a text corpus or da...
Automatic continuity of almost multiplicative maps between Frechet algebras
For Fréchet algebras $(A, (p_n))$ and $(B, (q_n))$, a linear map $T: A \rightarrow B$ is \textit{almost multiplicative} with respect to $(p_n)$ and $(q_n)$ if there exists $\varepsilon \geq 0$ such that $q_n(Tab - Ta\,Tb) \leq \varepsilon\, p_n(a)\, p_n(b)$ for all $n \in \mathbb{N}$, $a, b \in A$, and it is called \textit{weakly almost multiplicative} with respect to $(p_n)$ and $(q_n)$...
Journal title:
- Inf. Process. Manage.
Volume 32
Publication date: 1994